Census
Data visualization of the 2019 census data, cross-referenced with the housing data
!pip install pandas seaborn dataprep
!wget "https://data.gouv.nc/explore/dataset/rp-2019-indv-psud/download/?format=csv&timezone=Pacific/Noumea&lang=fr&use_labels_for_header=true&csv_separator=%2C" -O data/recensement_individus.csv
--2022-05-13 04:04:05-- https://data.gouv.nc/explore/dataset/rp-2019-indv-psud/download/?format=csv&timezone=Pacific/Noumea&lang=fr&use_labels_for_header=true&csv_separator=%2C
Resolving data.gouv.nc (data.gouv.nc)... 13.211.119.48, 13.55.171.246
Connecting to data.gouv.nc (data.gouv.nc)|13.211.119.48|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/csv]
Saving to: ‘data/recensement_individus.csv’
2022-05-13 04:05:35 (198 KB/s) - ‘data/recensement_individus.csv’ saved [17971968]
!wget "https://data.gouv.nc/explore/dataset/rp-2019-logements/download/?format=csv&timezone=Pacific/Noumea&lang=fr&use_labels_for_header=true&csv_separator=%2C" -O data/recensement_logements.csv
--2022-05-13 04:05:35-- https://data.gouv.nc/explore/dataset/rp-2019-logements/download/?format=csv&timezone=Pacific/Noumea&lang=fr&use_labels_for_header=true&csv_separator=%2C
Resolving data.gouv.nc (data.gouv.nc)... 13.211.119.48, 13.55.171.246
Connecting to data.gouv.nc (data.gouv.nc)|13.211.119.48|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/csv]
Saving to: ‘data/recensement_logements.csv’
2022-05-13 04:06:19 (259 KB/s) - ‘data/recensement_logements.csv’ saved [11293731]
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv("data/recensement_individus.csv")
df['COUPLE'] = df['COUPLE'].astype("category")
df['COUPLE'] = df['COUPLE'].cat.rename_categories({1: 'Vit en couple', 2: 'ne vit pas en couple'})
df['CS24'] = df['CS24'].astype("category")
df['CS24'] = df['CS24'].cat.rename_categories({10: 'Agriculteurs exploitants', 21: 'Artisans', 22: 'Commerçants et assimilés', 23: 'Chefs d\'entreprise de 10 salariés ou plus',
31: 'Professions libérales et assimilés', 32: 'Cadres de la fonction publique, professions intellectuelles et artistiques',
36: 'Cadres d\'entreprise', 41: 'Professions intermédiaires de l\'enseignement, de la santé, de la fonction publique et assimilés',
46: 'Professions intermédiaires administratives et commerciales des entreprises', 47: 'Techniciens',
48: 'Contremaîtres, agents de maîtrise', 51: 'Employés de la fonction publique',
54: 'Employés administratifs d\'entreprise', 55: 'Employés de commerce', 56: 'Personnels des services directs aux particuliers',
61: 'Ouvriers qualifiés', 66: 'Ouvriers non qualifiés', 69: 'Ouvriers agricoles'})
df['CS42'] = df['CS42'].astype("category")
df['CS42'] = df['CS42'].cat.rename_categories({11: 'Agriculteurs sur petites exploitations', 12: 'Agriculteurs sur moyennes exploitations', 13: 'Agriculteurs sur grandes exploitations',
21: 'Artisans', 22: 'Commerçants et assimilés' ,23: 'Chefs d\'entreprise de 10 salariés ou plus',
31: 'Professions libérales et assimilés', 33: 'Cadres de la fonction publique', 34: 'Professeurs, professions scientifiques',
35: 'Professions de l\'information, des arts et des spectacles', 37: 'Cadres administratifs et commerciaux d\'entreprise',
38: 'Ingénieurs et cadres techniques d\'entreprise', 42: 'Professeurs des écoles, instituteurs et assimilés',
43: 'Professions intermédiaires de la santé et du travail social',
44: 'Clergé, religieux', 45: 'Professions intermédiaires administratives de la fonction publique',
46: 'Professions intermédiaires administratives et commerciales des entreprises', 47: 'Techniciens',
48: 'Contremaîtres, agents de maîtrise', 52: 'Employés civils et agents de service de la fonction publique',
53: 'Policiers et militaires', 54: 'Employés administratifs d\'entreprise', 55: 'Employés de commerce',
56: 'Personnels des services directs aux particuliers', 62: 'Ouvriers qualifiés de type industriel',
63: 'Ouvriers qualifiés de type artisanal', 64: 'Chauffeurs', 65: 'Ouvriers qualifiés de la manutention, du magasinage et du transport',
67: 'Ouvriers non qualifiés de type industriel', 68: 'Ouvriers non qualifiés de type artisanal',
69: 'Ouvriers agricoles'})
df['CS8'] = df['CS8'].astype("category")
df['CS8'] = df['CS8'].cat.rename_categories({1: 'Agriculteurs exploitants', 2: 'Artisans, commerçants et chefs d\'entreprise',
3: 'Cadres et professions intellectuelles supérieures', 4 : 'Professions Intermédiaires',
5: 'Employés', 6: 'Ouvriers'})
df['CSSAL'] = df['CSSAL'].astype("category")
df['CSSAL'] = df['CSSAL'].cat.rename_categories({1: 'Manœuvre, ouvrier spécialisé', 2: 'Ouvrier qualifié ou hautement qualifié, technicien d’atelier',
3: 'Technicien (non cadre)', 4 : 'Agent de catégorie B de la fonction publique',
5: 'Agent de maîtrise, maîtrise administrative ou commerciale, VRP', 6: 'Agent de catégorie A de la fonction publique',
7: 'Ingénieur, cadre d’entreprise', 8: 'Agent de catégorie C ou D de la fonction publique',
9: 'Employé (par exemple : de bureau, de commerce, de la restauration, de maison)'})
# Labels for the EMPL codes; the mapping is kept for reference but not applied, so EMPL keeps its numeric codes
EMPL_labels = {3: 'Artisan, commerçant, industriel, travailleur indépendant', 4: 'Stagiaire rémunéré, apprenti sous contrat',
5: 'Salarié du secteur privé à durée déterminée', 6: 'Salarié du secteur privé à durée indéterminée',
7: 'Salarié du secteur public à durée déterminée', 8: 'Salarié du secteur public à durée indéterminée'}
df['EMPL'] = df['EMPL'].astype("category")
df['DIPL'] = df['DIPL'].astype("category")
diplomes_libelles = {1: 'Pas de scolarisation', 2: 'Aucun diplôme mais scolarisation jusqu’en primaire', 3: 'Aucun diplôme mais scolarisation jusqu’au collège',
4: 'Aucun diplôme mais scolarisation au-delà du collège', 11: 'CEP' , 12: 'BEPC, brevet élémentaire, brevet des collèges, DNB' , 13: 'CAP, BEP ou diplôme de niveau équivalent',
14: 'Bac général ou technologique, brevet supérieur, capacité en droit, DAEU, ESEU',
15: 'Bac professionnel, brevet professionnel de technicien ou d’enseignement, diplôme équivalent',
16: 'BTS, DUT, Deug, Deust, diplôme de santé ou du social niveau bac + 2, diplôme équivalent',
17: 'Licence, Licence pro, maîtrise, diplôme équivalent de niveau bac + 3 ou bac + 4',
18: 'Master, DEA, diplôme grande école niveau bac + 5, doctorat de santé',
19: 'Doctorat de recherche (hors santé)'}
#df['DIPL'] = df['DIPL'].cat.rename_categories(diplomes_libelles)
df.head()
| | ID | IDLOG | AGEA | AGER | ANNINS | APE | CNAT | COUPLE | CPAYSN | CPAYSRA | ... | STAT | STATANT | STM | TACT | TP | TRAANT | TRANS | TYP | TYPEMPL | TYPMENR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 200296 | 85580.0 | 7 | 7 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | 6 | NaN | NaN | NaN | NaN | 2 | NaN | 3.0 |
| 1 | 200301 | 48282.0 | 46 | 46 | NaN | 2410Z | NaN | Vit en couple | NaN | NaN | ... | 3.0 | NaN | 2 | 1.0 | 1.0 | NaN | 4.0 | 2 | 1.0 | 3.0 |
| 2 | 200307 | 477.0 | 15 | 15 | NaN | NaN | NaN | ne vit pas en couple | NaN | NaN | ... | NaN | NaN | 6 | 3.0 | NaN | 2.0 | 5.0 | 2 | NaN | 3.0 |
| 3 | 200315 | 94323.0 | 40 | 40 | NaN | 8411Z | NaN | Vit en couple | NaN | NaN | ... | 3.0 | NaN | 3 | 1.0 | 1.0 | NaN | 4.0 | 2 | 1.0 | 3.0 |
| 4 | 200318 | 107640.0 | 68 | 68 | 1972.0 | NaN | NaN | Vit en couple | NaN | NaN | ... | NaN | 1.0 | 1 | 5.0 | NaN | 1.0 | 4.0 | 2 | NaN | 5.0 |
5 rows × 43 columns
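The labelling cells above repeat the same two steps for each column: cast to `category`, then rename the codes. A minimal sketch of a helper that wraps both steps is shown here; `label_codes` is a hypothetical name and the toy series only stands in for `df['COUPLE']`.

```python
import pandas as pd

def label_codes(series: pd.Series, labels: dict) -> pd.Series:
    """Cast a numeric code column to a categorical and rename its codes."""
    as_cat = series.astype("category")
    # Codes that are missing from `labels` keep their numeric value, and
    # labels for codes absent from the data are ignored by rename_categories.
    return as_cat.cat.rename_categories(labels)

# Toy data standing in for df['COUPLE'] (hypothetical values)
couple = pd.Series([1, 2, 2, None, 1])
labelled = label_codes(couple, {1: 'Vit en couple', 2: 'ne vit pas en couple'})
print(labelled.value_counts(dropna=False))
```

Missing values stay missing after the rename, which matches the `NaN` entries visible in the `COUPLE` column of `df.head()`.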
df.describe()
| | ID | IDLOG | AGEA | AGER | ANNINS | CNAT | CPAYSN | CPAYSRA | EXER | GAD | ... | STAT | STATANT | STM | TACT | TP | TRAANT | TRANS | TYP | TYPEMPL | TYPMENR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 203144.000000 | 199929.000000 | 203144.000000 | 203144.000000 | 44082.000000 | 2412.000000 | 3137.00000 | 131.000000 | 89638.000000 | 203144.000000 | ... | 89155.000000 | 41032.000000 | 203144.000000 | 162368.000000 | 89638.000000 | 61332.000000 | 162852.000000 | 203144.000000 | 68942.000000 | 199844.000000 |
| mean | 135704.624434 | 54598.597797 | 35.574174 | 35.283617 | 1999.589651 | 432.129353 | 476.13803 | 454.458015 | 1.081428 | 3.085555 | ... | 2.755224 | 1.116494 | 3.879258 | 2.618632 | 1.114003 | 1.281729 | 3.701183 | 2.016245 | 1.252894 | 3.270726 |
| std | 78350.449683 | 31455.117174 | 21.826895 | 21.821489 | 17.898269 | 268.332958 | 76.04784 | 124.236639 | 0.328495 | 2.186296 | ... | 0.810840 | 0.379688 | 2.129446 | 1.953191 | 0.317817 | 0.449846 | 1.125026 | 0.126415 | 0.717101 | 1.215611 |
| min | 1.000000 | 2.000000 | 0.000000 | 0.000000 | 1927.000000 | 103.000000 | 127.00000 | 132.000000 | 1.000000 | 0.000000 | ... | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 1.000000 | 1.000000 |
| 25% | 67790.500000 | 27344.000000 | 17.000000 | 17.000000 | 1988.000000 | 219.000000 | 501.00000 | 501.000000 | 1.000000 | 1.000000 | ... | 3.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 4.000000 | 2.000000 | 1.000000 | 3.000000 |
| 50% | 135765.000000 | 54590.000000 | 35.000000 | 34.000000 | 2006.000000 | 416.000000 | 514.00000 | 501.000000 | 1.000000 | 3.000000 | ... | 3.000000 | 1.000000 | 4.000000 | 1.000000 | 1.000000 | 1.000000 | 4.000000 | 2.000000 | 1.000000 | 3.000000 |
| 75% | 203519.250000 | 81991.000000 | 52.000000 | 51.000000 | 2015.000000 | 514.000000 | 514.00000 | 501.000000 | 1.000000 | 5.000000 | ... | 3.000000 | 1.000000 | 6.000000 | 5.000000 | 1.000000 | 2.000000 | 4.000000 | 2.000000 | 1.000000 | 4.000000 |
| max | 271406.000000 | 109025.000000 | 104.000000 | 104.000000 | 2019.000000 | 999.000000 | 514.00000 | 514.000000 | 3.000000 | 9.000000 | ... | 9.000000 | 3.000000 | 6.000000 | 7.000000 | 2.000000 | 2.000000 | 5.000000 | 3.000000 | 9.000000 | 5.000000 |
8 rows × 30 columns
from dataprep.eda import plot
plot(df)
Dataset Statistics
| Statistic | Value |
|---|---|
| Number of Variables | 43 |
| Number of Rows | 203144 |
| Missing Cells | 3.5524e+06 |
| Missing Cells (%) | 40.7% |
| Duplicate Rows | 0 |
| Duplicate Rows (%) | 0.0% |
| Total Size in Memory | 103.8 MB |
| Average Row Size in Memory | 535.9 B |
| Variable Types | |
Dataset Insights
| Insight | Type |
|---|---|
| ID is uniformly distributed | Uniform |
| AGEA and AGER have similar distributions | Similar Distribution |
| CNAT and CPAYSN have similar distributions | Similar Distribution |
| IDLOG has 3215 (1.58%) missing values | Missing |
| ANNINS has 159062 (78.3%) missing values | Missing |
| APE has 156869 (77.22%) missing values | Missing |
| CNAT has 200732 (98.81%) missing values | Missing |
| COUPLE has 40776 (20.07%) missing values | Missing |
| CPAYSN has 200007 (98.46%) missing values | Missing |
| CPAYSRA has 203013 (99.94%) missing values | Missing |
| CS24 has 128604 (63.31%) missing values | Missing |
| CS42 has 128604 (63.31%) missing values | Missing |
| CS8 has 128604 (63.31%) missing values | Missing |
| CSSAL has 140423 (69.12%) missing values | Missing |
| DIPL has 40776 (20.07%) missing values | Missing |
| EMPL has 113506 (55.87%) missing values | Missing |
| EXER has 113506 (55.87%) missing values | Missing |
| IRA has 13748 (6.77%) missing values | Missing |
| MINE has 199871 (98.39%) missing values | Missing |
| PROVRA has 13748 (6.77%) missing values | Missing |
| PROVTRA has 113506 (55.87%) missing values | Missing |
| RECH has 144665 (71.21%) missing values | Missing |
| SAL has 191662 (94.35%) missing values | Missing |
| SCOL has 26951 (13.27%) missing values | Missing |
| SECT10 has 113506 (55.87%) missing values | Missing |
| SECT21 has 113506 (55.87%) missing values | Missing |
| SECT5 has 113506 (55.87%) missing values | Missing |
| STAT has 113989 (56.11%) missing values | Missing |
| STATANT has 162112 (79.8%) missing values | Missing |
| TACT has 40776 (20.07%) missing values | Missing |
| TP has 113506 (55.87%) missing values | Missing |
| TRAANT has 141812 (69.81%) missing values | Missing |
| TRANS has 40292 (19.83%) missing values | Missing |
| TYPEMPL has 134202 (66.06%) missing values | Missing |
| TYPMENR has 3300 (1.62%) missing values | Missing |
| ANNINS is skewed | Skewed |
| CNAT is skewed | Skewed |
| CPAYSN is skewed | Skewed |
| GAD is skewed | Skewed |
| APE has a high cardinality: 369 distinct values | High Cardinality |
| MINE has constant value "1.0" | Constant |
| PROV has constant value "Sud" | Constant |
| APE has constant length 5 | Constant Length |
| CPAYSRA has constant length 5 | Constant Length |
| EMPL has constant length 3 | Constant Length |
| EXER has constant length 3 | Constant Length |
| GENRE has constant length 1 | Constant Length |
| ILN has constant length 1 | Constant Length |
| IRA has constant length 3 | Constant Length |
| MINE has constant length 3 | Constant Length |
| NAT has constant length 1 | Constant Length |
| PROV has constant length 3 | Constant Length |
| RECH has constant length 3 | Constant Length |
| SAL has constant length 3 | Constant Length |
| SCOL has constant length 3 | Constant Length |
| SECT10 has constant length 2 | Constant Length |
| SECT21 has constant length 1 | Constant Length |
| SECT5 has constant length 3 | Constant Length |
| STAT has constant length 3 | Constant Length |
| STATANT has constant length 3 | Constant Length |
| STM has constant length 1 | Constant Length |
| TACT has constant length 3 | Constant Length |
| TP has constant length 3 | Constant Length |
| TRAANT has constant length 3 | Constant Length |
| TRANS has constant length 3 | Constant Length |
| TYP has constant length 1 | Constant Length |
| TYPEMPL has constant length 3 | Constant Length |
| TYPMENR has constant length 3 | Constant Length |
| GAD has 28705 (14.13%) zeros | Zeros |
| GAQ has 13748 (6.77%) zeros | Zeros |
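The "Missing" insights above can also be reproduced with pandas alone, which is handy when only the per-column figures are needed. The following is a sketch on toy data; `missing_summary` is a hypothetical helper and the values are made up for illustration.

```python
import pandas as pd

def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column, most-missing first."""
    counts = frame.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "missing_pct": (counts / len(frame) * 100).round(2),
    })
    return summary.sort_values("missing", ascending=False)

# Toy frame with hypothetical values; on the real data this would be missing_summary(df)
toy = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "ANNINS": [None, 1972.0, None, None],
    "COUPLE": [1, None, 2, 1],
})
summary = missing_summary(toy)
print(summary)
```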
#df2 = df[['ID', 'DIPL', 'EMPL']]
#df2 = df2.fillna(0)
#df2.head()
#df2 = df2.pivot_table('DIPL', 'EMPL', 'ID', aggfunc="sum")
#f, ax = plt.subplots(figsize=(9, 6))
#sns.heatmap(df2, annot=True, linewidths=.5, ax=ax)
df_age = df[['AGER', 'GENRE']]
# GENRE is a code (1 = homme, 2 = femme): count rows per age rather than summing the code,
# otherwise the female totals come out doubled
df_homme = df_age.loc[df_age['GENRE'] == 1].groupby('AGER').count()
df_homme['AGER'] = df_homme.index
print(df_homme.shape)
print(df_homme.loc[df_homme['AGER'] == 40])
# Negate the male counts so that their bars extend to the left of zero
df_homme['GENRE'] = -df_homme['GENRE']
df_homme = df_homme.rename(columns={'GENRE': 'homme'})
df_femme = df_age.loc[df_age['GENRE'] == 2].groupby('AGER').count()
df_femme = df_femme.rename(columns={'GENRE': 'femme'})
df_femme['AGER'] = df_femme.index
print(df_femme.shape)
df_femme.loc[df_femme['AGER'] == 40]
(101, 2)
      GENRE  AGER
AGER
40     1349    40
(105, 2)
| | femme | AGER |
|---|---|---|
| AGER | | |
| 40 | 1547 | 40 |
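`GENRE` is a code (1 or 2, as used in the filters above), so a sum over it equals a head count only for code 1; for code 2 it is twice the head count. The sketch below contrasts `sum` with `size` on toy data (hypothetical values) and shows `pd.crosstab` as one possible way to get both sexes in a single table.

```python
import pandas as pd

# Toy data: one man and three women aged 40, one woman aged 41 (hypothetical)
toy = pd.DataFrame({"AGER": [40, 40, 40, 40, 41], "GENRE": [1, 2, 2, 2, 2]})

women = toy[toy["GENRE"] == 2]
summed = women.groupby("AGER")["GENRE"].sum()   # adds up the code 2, not the people
counted = women.groupby("AGER").size()          # number of rows, i.e. the head count

# Head counts for both sexes at once, one row per age
by_sex = pd.crosstab(toy["AGER"], toy["GENRE"]).rename(columns={1: "homme", 2: "femme"})

print(summed[40], counted[40])
print(by_sex)
```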
Age pyramid
sns.set(font_scale=2)
# Put both sexes side by side and reverse the order so that the oldest ages are at the top
df4 = pd.concat([df_homme, df_femme], axis=1).iloc[::-1]
figure = plt.figure(figsize=(50, 50))
# Male values are negative (bars to the left), female values positive (bars to the right)
bar_plot = sns.barplot(x='homme', y=df4.index, data=df4, order=df4.index, lw=0, orient='horizontal')
bar_plot = sns.barplot(x='femme', y=df4.index, data=df4, order=df4.index, lw=0, orient='horizontal')
bar_plot.set(ylabel="Age", xlabel="Nombre de personnes", title="Pyramide des âges")
# Vertical line at x = 0 separating the two sides
plt.plot([0, 0], [0, 105], linewidth=2)
[<matplotlib.lines.Line2D at 0x7fb34caf38e0>]
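For comparison, the same kind of pyramid can be drawn with plain matplotlib, which makes the sign trick explicit: male counts are negated for drawing and the tick labels are then shown as absolute values. This is only a sketch on toy head counts (hypothetical values), not the figure produced by the cell above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy head counts per age (hypothetical values)
counts = pd.DataFrame(
    {"homme": [30, 25, 20, 10], "femme": [28, 27, 22, 15]},
    index=pd.Index([38, 39, 40, 41], name="AGER"),
)

fig, ax = plt.subplots(figsize=(6, 4))
# Men are drawn to the left of zero, women to the right
ax.barh(counts.index, -counts["homme"], label="homme")
ax.barh(counts.index, counts["femme"], label="femme")
ax.axvline(0, color="black", linewidth=1)

# Relabel the x axis with absolute values so that no negative counts are displayed
ticks = ax.get_xticks()
ax.set_xticks(ticks)
ax.set_xticklabels([str(int(abs(t))) for t in ticks])
ax.set(ylabel="Age", xlabel="Nombre de personnes", title="Pyramide des âges")
ax.legend()
```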
Relationship between education level, employment type, and socio-professional category
import numpy as np
df_metier = df[['DIPL', 'EMPL', 'CS8']].dropna()
sns.set(font_scale=2)
fig, ax = plt.subplots(figsize=(50, 30))
# Number of people per (DIPL, EMPL) pair; only needed if the point sizes are scaled by frequency
df_counts = df_metier.groupby(['DIPL', 'EMPL']).size().reset_index(name='count')
scale = 500*df_counts['count'].size
size = df_counts['count']/df_counts['count'].sum()*scale
# Axis labels, listed in the order of the DIPL and EMPL codes
dipl_lbl = ['Pas de scolarisation', 'Aucun diplôme mais scolarisation jusqu’en primaire', 'Aucun diplôme mais scolarisation jusqu’au collège',
'Aucun diplôme mais scolarisation au-delà du collège', 'CEP' , 'BEPC, brevet élémentaire, brevet des collèges, DNB' , 'CAP, BEP ou diplôme de niveau équivalent',
'Bac général ou technologique, brevet supérieur, capacité en droit, DAEU, ESEU',
'Bac professionnel, brevet professionnel de technicien ou d’enseignement, diplôme équivalent',
'BTS, DUT, Deug, Deust, diplôme de santé ou du social niveau bac + 2, diplôme équivalent',
'Licence, Licence pro, maîtrise, diplôme équivalent de niveau bac + 3 ou bac + 4',
'Master, DEA, diplôme grande école niveau bac + 5, doctorat de santé',
'Doctorat de recherche (hors santé)']
empl_lbl = ['Artisan, commerçant, industriel, travailleur indépendant', 'Stagiaire rémunéré, apprenti sous contrat',
'Salarié du secteur privé à durée déterminée', 'Salarié du secteur privé à durée indéterminée',
'Salarié du secteur public à durée déterminée', 'Salarié du secteur public à durée indéterminée']
# Encode each categorical column as integer positions. pandas category codes are used here instead of
# sklearn's OrdinalEncoder, which is not installed in this environment (the import raised ModuleNotFoundError);
# the codes follow the declared category order, so CS8 matches the order of the legend labels below
enc_df = df_metier.apply(lambda col: col.cat.codes)
# Random jitter in [0, 0.5) so that points sharing the same pair of codes do not overlap completely
xnoise, ynoise = np.random.random(len(df_metier))/2, np.random.random(len(df_metier))/2
sns.scatterplot(x=enc_df["DIPL"]+xnoise, y=enc_df["EMPL"]+ynoise, alpha=0.5, hue=enc_df['CS8'], palette="hls", ax=ax)
# Ticks sit at the centre of each jittered band (code + 0.25)
plt.yticks(np.arange(0.25, len(empl_lbl)+0.25, 1), empl_lbl)
xrange = np.arange(0.25, len(dipl_lbl)+0.25, 1)
plt.xticks(xrange, dipl_lbl, rotation=90)
plt.legend(title='Categories socioprofessionnelles', loc='lower left', labels=['Agriculteurs exploitants', 'Artisans, commerçants et chefs d\'entreprise',
'Cadres et professions intellectuelles supérieures', 'Professions Intermédiaires',
'Employés', 'Ouvriers'])
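The tick labels in the cell above are hard-coded in code order, which only lines up with the encoded positions if every code actually occurs in the filtered data. A sketch that derives the positions and the labels from the observed codes is given below; it uses `pd.factorize` on toy data, with a subset of the EMPL labels copied from above, and the observations themselves are hypothetical.

```python
import numpy as np
import pandas as pd

# Subset of the EMPL labels used above
empl_labels = {
    3: 'Artisan, commerçant, industriel, travailleur indépendant',
    4: 'Stagiaire rémunéré, apprenti sous contrat',
    5: 'Salarié du secteur privé à durée déterminée',
    6: 'Salarié du secteur privé à durée indéterminée',
}

# Hypothetical observations in which code 4 never occurs
empl = pd.Series([6, 3, 6, 5, 6])

# positions: 0-based rank of each observed code; observed: the sorted distinct codes
positions, observed = pd.factorize(empl, sort=True)
tick_labels = [empl_labels.get(code, str(code)) for code in observed]
tick_positions = np.arange(len(observed)) + 0.25

# Seeded jitter in [0, 0.5), so the scatter is reproducible between runs
rng = np.random.default_rng(0)
jittered = positions + rng.random(len(positions)) / 2

print(list(zip(tick_positions, tick_labels)))
```

Because the labels are looked up from the codes that are present, a missing code shortens the axis instead of shifting every later label by one position.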